Optimal algorithms for computing the Robinson and Foulds topologic distance between two trees and the strict consensus trees of k trees given their distance matrices
Abstract
It has been postulated that existing species have been linked in the past in a way that can be described using an additive tree structure. Any such tree structure reflecting species relationships is associated with a matrix of distances between the species considered, called a distance matrix or a tree metric matrix. A circular order of the elements of X corresponds to a circular (clockwise) scanning of the subset X of vertices of a tree drawing on the plane. This paper describes an optimal algorithm that uses circular orders to compare the topologies of two trees given by their distance matrices. The algorithm computes the Robinson and Foulds topologic distance between the two trees: it employs circular order tree reconstruction to build an ordered bipartition table of the tree edges for each of the given distance matrices, and then compares these tables to determine the Robinson and Foulds topologic distance, known to be an important criterion of tree similarity. The described algorithm has optimal time complexity, requiring O(n²) time when performed on two n×n distance matrices. It can be generalized to another optimal algorithm, which computes the strict consensus tree of k unrooted trees, given their distance matrices, in O(kn²) time.
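The comparison step can be illustrated with a small, naive sketch. It is not the circular-order algorithm of the paper and makes no claim to the optimal bounds. It assumes, purely for illustration, that each tree is already available as nested tuples of leaf labels rather than as a distance matrix; the function names (`bipartitions`, `rf_distance`, `strict_consensus_splits`) are hypothetical. The sketch collects the non-trivial bipartitions induced by the internal edges of each tree, counts the bipartitions found in exactly one of two trees (the Robinson and Foulds distance), and keeps the bipartitions shared by all trees (the splits of the strict consensus tree).

```python
from functools import reduce

# Assumption: a tree is a leaf label (str) or a tuple of subtrees.
# This input format is hypothetical; the paper starts from distance matrices.

def leaves(tree):
    """Return the set of leaf labels below a node."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def bipartitions(tree):
    """Non-trivial bipartitions of the leaf set induced by internal edges.

    Each bipartition is stored as the side that does NOT contain a fixed
    reference leaf, so the same split always has the same representation.
    """
    all_leaves = leaves(tree)
    ref = min(all_leaves)          # arbitrary but fixed reference leaf
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Skip trivial splits (empty side, single leaf, or all-but-one leaf).
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(side)
        return below

    visit(tree)
    return splits

def rf_distance(tree_a, tree_b):
    """Number of bipartitions present in exactly one of the two trees."""
    return len(bipartitions(tree_a) ^ bipartitions(tree_b))

def strict_consensus_splits(trees):
    """Bipartitions shared by every tree in the collection."""
    return reduce(lambda acc, s: acc & s, (bipartitions(t) for t in trees))

# Two trees on the same five leaves that disagree about one internal edge.
t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "c"), "b"), ("d", "e"))

print(rf_distance(t1, t2))                 # 2
print(strict_consensus_splits([t1, t2]))   # only the split {d, e} | {a, b, c}
```

Storing each bipartition by the side that excludes a fixed reference leaf is one simple way to make set comparison meaningful for unrooted trees; the ordered bipartition table described in the abstract serves the same purpose while supporting the stated time bounds.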
Related resources
Comparison of Additive Trees Using Circular Orders
It has been postulated that existing species have been linked in the past in a way that can be described using an additive tree structure. Any such tree structure reflecting species relationships is associated with a matrix of distances between the species considered which is called a distance matrix or a tree metric matrix. A circular order of elements of X corresponds to a circular (clockwise...
Fast Hashing Algorithms to Summarize Large Collections of Evolutionary Trees
Different phylogenetic methods often yield different inferred trees for the same set of organisms. Moreover, a single phylogenetic approach (such as a Bayesian analysis) can produce many trees. Consensus trees and topological distance matrices are often used to summarize the evolutionary relationships among the trees of interest. These summarization techniques are implemented in current phyloge...
Algorithms for Computing Cluster Dissimilarity between Rooted Phylogenetic Trees
Phylogenetic trees represent the historical evolutionary relationships between different species or organisms. Creating and maintaining a repository of phylogenetic trees is one of the major objectives of molecular evolution studies. One way of mining phylogenetic information databases would be to compare the trees by using a tree comparison measure. Presented here are a new dissimilarity measu...
A practical O(n log n) time algorithm for computing the triplet distance on binary trees
The triplet distance is a distance measure that compares two rooted trees on the same set of leaves by enumerating all subsets of three leaves and counting how often the induced topologies of the tree are equal or different. We present an algorithm that computes the triplet distance between two rooted binary trees in time O(n log n). The algorithm is related to an algorithm for computing t...
A Randomized Algorithm for Comparing Sets of Phylogenetic Trees
Phylogenetic analyses often produce a large number of candidate evolutionary trees, each a hypothesis of the "true" tree. Post-processing techniques such as strict consensus trees are widely used to summarize the evolutionary relationships into a single tree. However, valuable information is lost during the summarization process. A more elementary step is to produce estimates of the topological di...